TritonSort: A Balanced Large-Scale Sorting System

Authors

  • Alexander Rasmussen
  • George Porter
  • Michael Conley
  • Harsha V. Madhyastha
  • Radhika Niranjan Mysore
  • Alexander Pucher
  • Amin Vahdat
Abstract

We present TritonSort, a highly efficient, scalable sorting system. It is designed to process large datasets, and has been evaluated against as much as 100 TB of input data spread across 832 disks in 52 nodes at a rate of 0.916 TB/min. When evaluated against the annual Indy GraySort sorting benchmark, TritonSort is 60% better in absolute performance and has over six times the per-node efficiency of the previous record holder. In this paper, we describe the hardware and software architecture necessary to operate TritonSort at this level of efficiency. Through careful management of system resources to ensure cross-resource balance, we are able to sort data at approximately 80% of the disks’ aggregate sequential write speed. We believe the work holds a number of lessons for balanced system design and for scale-out architectures in general. While many interesting systems are able to scale linearly with additional servers, per-server performance can lag behind per-server capacity by more than an order of magnitude. Bridging the gap between high scalability and high performance would enable either significantly cheaper systems that are able to do the same work or provide the ability to address significantly larger problem sets with the same infrastructure.
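The headline figures in the abstract can be turned into rough per-node and per-disk numbers. The sketch below is only back-of-the-envelope arithmetic on the values quoted above (100 TB, 832 disks, 52 nodes, 0.916 TB/min). It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and an even spread of work across nodes and disks; the function and variable names are illustrative and do not come from the paper.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the abstract.
# Assumption: decimal units (1 TB = 10**12 bytes, 1 MB = 10**6 bytes).
TB = 10**12
MB = 10**6

def per_unit_rate_mb_s(total_tb_per_min, units):
    """End-to-end sort rate divided evenly across `units`, in MB/s."""
    bytes_per_second = total_tb_per_min * TB / 60
    return bytes_per_second / units / MB

NODES = 52            # from the abstract
DISKS = 832           # from the abstract
RATE_TB_MIN = 0.916   # from the abstract
INPUT_TB = 100        # from the abstract

disks_per_node = DISKS // NODES
per_node = per_unit_rate_mb_s(RATE_TB_MIN, NODES)
per_disk = per_unit_rate_mb_s(RATE_TB_MIN, DISKS)
minutes_for_input = INPUT_TB / RATE_TB_MIN

print(f"disks per node: {disks_per_node}")
print(f"sorted data per node: {per_node:.1f} MB/s")
print(f"sorted data per disk: {per_disk:.2f} MB/s")
print(f"time for {INPUT_TB} TB: {minutes_for_input:.1f} min")
```

Note that these values describe end-to-end sorted output per second. They are not the same quantity as the disks' aggregate sequential write speed mentioned in the abstract, so the per-disk figure should not be compared directly with the 80% claim.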


Similar resources

TritonSort 2014 (1.1 Handling Arbitrary Record Sizes; 1.2 Handling Data Skew)

We present TritonSort, a sorting system designed to maximize system resource utilization. We present the results for: Indy GraySort, Daytona GraySort, Indy MinuteSort, Indy CloudSort, and Daytona CloudSort.


Reader 8 Sampler 8 Intra-Node Merger 1 Sender 1 Receiver 1 Inter-Node Merger 8 Meta-Merger 1 Partition Calculator 1 NFS Filesystem

We present TritonSort, a sorting system designed to maximize system resource utilization. We present the results for: 1) Indy GraySort and Daytona GraySort, 2) Indy MinuteSort, and 4) Indy and Daytona 10 JouleSort.


Energy-Efficient Fast Sorting 2011

Authors of this report participated in the Sort Benchmark contests in 2009 and 2010. In 2009, our DEMSort program took the lead in the then-new Indy Gray category [RSSK09, RSS10], sorting 100 TB on a cluster with about 200 nodes. A tie was declared with Yahoo, whose Hadoop-based program achieved about the same result in the Daytona class, but with 17 times the hardware effort. Former results in...


A Novel Deterministic Sampling Scheme with Applications to Broadcast-Efficient Sorting on the Reconfigurable Mesh

The main contribution of this work is to present a simple deterministic sampling strategy that, when used for bucket sorting, yields buckets that are remarkably well balanced, making costly balancing unnecessary. To the best of our knowledge this is the first instance of a deterministic sampling strategy featuring this performance. Although the strategy is perfectly general, we illustrate its pow...


Modelling and optimization of a tri-objective Transportation-Location-Routing Problem considering route reliability: using MOGWO, MOPSO, MOWCA and NSGA-II

 In this research, a tri-objective mathematical model is proposed for the Transportation-Location-Routing problem. The model considers a three-echelon supply chain and aims to minimize total costs, maximize the minimum reliability of the traveled routes and establish a well-balanced set of routes. In order to solve the proposed model, four metaheuristic algorithms, including Multi-Objective Gre...



Publication date: 2011